Domain Adaptation for Learning from Label Proportions Using Self-Training

Authors

  • Ehsan Mohammady Ardehaly
  • Aron Culotta
Abstract

Learning from Label Proportions (LLP) is a machine learning problem in which the training data consist of bags of instances, and only the class label distribution for each bag is known. In some domains label proportions are readily available; for example, by grouping social media users by location, one can use census statistics to build a classifier for user demographics. However, label proportions are unavailable in many domains, such as product review sites. The goal of this paper is to determine whether an LLP classifier fit in one domain can be modified to classify instances from another domain. To do so, we propose a domain adaptation algorithm that uses an LLP model fit on the source domain to generate label proportions for the target domain. A new LLP model is then fit on the target domain, and this self-training process is repeated to adapt the model from source to target. Our experiments on five diverse tasks indicate an 11% average absolute improvement in accuracy as compared to using LLP without domain adaptation. In contrast to existing domain adaptation algorithms, our approach requires only label proportions in the source domain, and the results suggest that the approach is effective even when the target domain is substantially different from the source domain.
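The self-training loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes a binary task, a logistic model fit by gradient descent on the squared gap between each bag's mean predicted probability and its label proportion, and target bags formed by sorting target instances by predicted score. The names `fit_llp` and `self_train`, the bagging strategy, and all hyperparameters are hypothetical.

```python
import numpy as np


def sigmoid(z):
    # Clip to avoid overflow in exp for large-magnitude scores.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))


def fit_llp(bags, proportions, n_iter=500, lr=0.5, l2=1e-3):
    """Fit logistic weights so that each bag's mean predicted probability
    matches its known positive-label proportion (assumed loss: squared error)."""
    w = np.zeros(bags[0].shape[1])
    for _ in range(n_iter):
        grad = l2 * w
        for X, p in zip(bags, proportions):
            q = sigmoid(X @ w)
            # Gradient of (mean(q) - p)^2 with respect to w.
            grad += 2.0 * (q.mean() - p) * (X.T @ (q * (1.0 - q))) / len(X)
        w -= lr * grad / len(bags)
    return w


def self_train(source_bags, source_props, X_target, n_rounds=3, n_bags=10):
    """Adapt an LLP model from source to target by repeated self-training."""
    # Step 1: fit on the source domain, where label proportions are known.
    w = fit_llp(source_bags, source_props)
    for _ in range(n_rounds):
        # Step 2: score target instances with the current model.
        scores = sigmoid(X_target @ w)
        # Assumption: build target bags by sorting on score so that the
        # estimated proportions differ from bag to bag.
        splits = np.array_split(np.argsort(scores), n_bags)
        target_bags = [X_target[idx] for idx in splits]
        # Step 3: use the model's own predictions as target label proportions.
        target_props = [float((scores[idx] > 0.5).mean()) for idx in splits]
        # Step 4: fit a new LLP model on the target domain and repeat.
        w = fit_llp(target_bags, target_props)
    return w
```

Calling `self_train(source_bags, source_props, X_target)` returns a weight vector that can score target instances with `sigmoid(X_target @ w)`. No instance-level labels are used in either domain, which matches the setting the abstract describes.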

Similar resources

Deep Unsupervised Domain Adaptation for Image Classification via Low Rank Representation Learning

Domain adaptation is a powerful technique when a large amount of labeled data with similar attributes is available in different domains. In real-world applications, a huge amount of data is available, but most of it is unlabeled. Domain adaptation is effective in image classification, where obtaining adequately labeled data is expensive and time-consuming. We propose a novel method named DALRRL, which consists of deep ...

Sample-oriented Domain Adaptation for Image Classification

Image processing is a method for performing operations on an image in order to obtain an enhanced image or to extract useful information from it. Conventional image processing algorithms cannot perform well in scenarios where the training images (source domain) used to learn the model have a different distribution from the test images (target domain). Also, many real world applicat...

Image alignment via kernelized feature learning

Machine learning is an application of artificial intelligence that is able to automatically learn and improve from experience without being explicitly programmed. The primary assumption of most machine learning algorithms is that the training set (source domain) and the test set (target domain) follow the same probability distribution. However, in most of the real-world application...

A Literature Review of Domain Adaptation with Unlabeled Data

In supervised learning, it is typically assumed that the labeled training data comes from the same distribution as the test data to which the system will be applied. In recent years, machine-learning researchers have investigated methods to handle mismatch between the training and test domains, with the goal of building a classifier using the labeled data in the old domain that will perform wel...

An unsupervised deep domain adaptation approach for robust speech recognition

This paper addresses the robust speech recognition problem as a domain adaptation task. Specifically, we introduce an unsupervised deep domain adaptation (DDA) approach to acoustic modeling in order to eliminate the training–testing mismatch that is common in real-world use of speech recognition. Under a multi-task learning framework, the approach jointly learns two discriminative classifiers u...

Publication date: 2016